Study on Preprocessing and Classifying Mass Spectral Raw Data Concerning Human Normal and Disease Cases

Authors

  • Xenofon E. Floros
  • George M. Spyrou
  • Konstantinos N. Vougas
  • George T. Tsangaris
  • Konstantina S. Nikita
Abstract

Mass spectrometry is becoming an important tool in the biological sciences. Tissue samples or easily obtained biological fluids (serum, plasma, urine) are analysed by a variety of mass spectrometry methods, producing spectra characterized by very high dimensionality and a high level of noise. Here we present a feature extraction method for mass spectra that consists of two main steps. In the first step, an algorithm for low-level preprocessing of mass spectra is applied, including denoising with the Shift-Invariant Discrete Wavelet Transform (SIDWT), smoothing, baseline correction, peak detection and normalization of the resulting peak lists. After this step, we claim to have reduced the dimensionality and redundancy of the initial mass spectra representation while keeping all the meaningful features (potential biomarkers) required for disease-related proteomic patterns to be identified. In the second step, the peak lists are aligned and fed to a Support Vector Machine (SVM), which classifies the mass spectra. This procedure was applied to SELDI-QqTOF spectral data collected from normal and ovarian cancer serum samples. The classification performance was assessed for distinct values of the parameters involved in the feature extraction pipeline. The method described here for low-level preprocessing of mass spectra results in 98.3% sensitivity, 98.3% specificity and an AUC (Area Under Curve) of 0.981 in spectra classification.
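As a rough illustration of the first step, the sketch below chains smoothing, baseline correction, peak detection and peak-list normalization on a synthetic spectrum. It is not the authors' algorithm: a moving average stands in for SIDWT denoising, the baseline is estimated with a rolling minimum, and all function names, window sizes and the 5% height threshold are assumptions chosen for the example. Peak alignment and SVM classification are not covered.

```python
import math


def smooth(y, half_window=2):
    """Moving-average smoothing (a simple stand-in for SIDWT denoising)."""
    n = len(y)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out


def subtract_baseline(y, half_window=20):
    """Estimate the baseline as a rolling minimum and subtract it."""
    n = len(y)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        out.append(y[i] - min(y[lo:hi]))
    return out


def detect_peaks(mz, y, min_rel_height=0.05):
    """Return (m/z, intensity) for local maxima above a fraction of the maximum."""
    if not y:
        return []
    threshold = min_rel_height * max(y)
    peaks = []
    for i in range(1, len(y) - 1):
        if y[i] > y[i - 1] and y[i] >= y[i + 1] and y[i] > threshold:
            peaks.append((mz[i], y[i]))
    return peaks


def normalize(peaks):
    """Scale peak intensities so that they sum to 1 (assumed normalization)."""
    total = sum(h for _, h in peaks)
    if total == 0:
        return list(peaks)
    return [(m, h / total) for m, h in peaks]


def preprocess(mz, intensity):
    """Hypothetical pipeline: smooth, correct baseline, pick peaks, normalize."""
    y = smooth(intensity)
    y = subtract_baseline(y)
    return normalize(detect_peaks(mz, y))


# Synthetic spectrum: sloping baseline plus two Gaussian peaks (made-up data).
mz = [1000.0 + i for i in range(200)]
raw = [
    5.0 + 0.01 * i
    + 100.0 * math.exp(-0.5 * ((i - 60) / 3.0) ** 2)
    + 50.0 * math.exp(-0.5 * ((i - 140) / 3.0) ** 2)
    for i in range(200)
]

peaks = preprocess(mz, raw)
for m, h in peaks:
    print(f"m/z {m:.1f}  relative intensity {h:.3f}")
```

The output is a short peak list in place of the 200-point spectrum, which is the kind of dimensionality reduction the abstract describes. On real spectra the window sizes and threshold would need tuning, in line with the abstract's note that performance was assessed for distinct parameter values.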


Similar resources

Enhancing Learning from Imbalanced Classes via Data Preprocessing: A Data-Driven Application in Metabolomics Data Mining

This paper presents a data mining application in metabolomics. It aims at building an enhanced machine learning classifier that can be used for diagnosing cachexia syndrome and identifying its involved biomarkers. To achieve this goal, a data-driven analysis is carried out using a public dataset consisting of 1H-NMR metabolite profile. This dataset suffers from the problem of imbalanced classes...

Synthesis and spectral characterization of naphthyldihydrazones derived from some 1,3-dicarbonyl compounds and their Ni(II), Cu(II) and Zn(II) complexes

The coupling of tetrazotised 1,8-diaminonaphthalene with 1,3-dicarbonyl compounds [acetylacetone, methylacetoacetate and acetoacetanilide] yielded a new series of tetradentate ligand systems. Analytical, IR, 1H NMR and mass spectral data indicate that the compounds exist in the intramolecularly hydrogen-bonded dihydrazone form. Dibasic tetradentate N2O2 coordination of these compounds in their [ML...

Theoretical study of structure spectral properties of Tacrine as Alzheimer drug

Tacrine (9-amino-1,2,3,4-tetrahydroacridine), a reversible inhibitor of acetylcholinesterase (AChE), was the first drug for the symptomatic treatment of Alzheimer’s disease (AD). NMR structure determination still presents some considerable challenges: the method is limited to systems of relatively small molecular mass, data collection times are long, data analysis remains a lengthy procedure, and...

Discrimination of Human Cell Lines by Infrared Spectroscopy and Mathematical Modeling

Variations in biochemical features are extensive among cells. Identifying a marker that is specific to each cell type is essential for following stem cell differentiation and metastatic growth. Fourier transform infrared spectroscopy (FTIR), as a biochemical analysis, has focused mainly on the diagnosis of cancerous cells. In this study, commercially obtained cell lines such as Human ovarian carcin...




Publication date: 2006